NSF-ITR/IM PROJECT: 2004 Abstracts From Bits to Information: Statistical Learning Technologies for Digital Information Management Search
Abstract
Project Title: Term Informativeness PI: T. Jaakkola Participants: Jason Rennie and Tommi Jaakkola (MIT CSAIL) Abstract: Informal communication (e-mail, bulletin boards) poses a difficult learning environment because traditional grammatical and lexical information is noisy. For named entity extraction, how topic-centric, or “informative,” a word is can provide valuable additional information. We introduce a new informativeness score based on mixture models for the task of extracting restaurant names from bulletin board posts. By combining the mixture score with IDF, we are able to achieve significant gains on a restaurant extraction task. We also motivate and discuss a Bayesian version of the score that would better capture the variability in term occurrence rates.
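To make the two quantities in the abstract concrete, here is a minimal sketch of IDF alongside one possible mixture-based informativeness score. The abstract does not give the formula, so everything here is an assumption: the score is taken to be the log-likelihood gain of a two-component binomial mixture (fit by EM) over a single binomial for a term's per-document counts, and the names `idf`, `mixture_score`, and the toy counts are hypothetical. Multiplying the two scores is likewise only one plausible way to combine them, not necessarily what the project did.

```python
import math


def idf(doc_freq, num_docs):
    """Inverse document frequency: -log of the fraction of documents containing the term."""
    return -math.log(doc_freq / num_docs)


def _clip(p, eps=1e-9):
    # Keep probabilities strictly inside (0, 1) so the logs stay finite.
    return min(max(p, eps), 1.0 - eps)


def _log_kernel(x, n, theta):
    # Binomial log-likelihood without the binomial coefficient,
    # which cancels when two models are compared on the same data.
    return x * math.log(theta) + (n - x) * math.log(1.0 - theta)


def _log_mix(x, n, lam, th1, th2):
    # Returns (log-likelihood under the mixture, log joint of component 1).
    a = math.log(lam) + _log_kernel(x, n, th1)
    b = math.log(1.0 - lam) + _log_kernel(x, n, th2)
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m)), a


def mixture_score(counts, lengths, iters=200):
    """Assumed score: log-likelihood gain of a 2-binomial mixture over one binomial.

    counts[i] is the term's count in document i, lengths[i] the document length.
    """
    docs = list(zip(counts, lengths))
    theta = _clip(sum(counts) / sum(lengths))
    ll_single = sum(_log_kernel(x, n, theta) for x, n in docs)

    # Start EM with one low-rate and one high-rate component.
    lam, th1, th2 = 0.5, _clip(theta / 2), _clip(theta * 2)
    for _ in range(iters):
        # E-step: responsibility of component 1 for each document.
        resp = []
        for x, n in docs:
            total, a = _log_mix(x, n, lam, th1, th2)
            resp.append(math.exp(a - total))
        # M-step: re-estimate the mixing weight and both rates.
        lam = _clip(sum(resp) / len(resp))
        th1 = _clip(sum(r * x for r, (x, _) in zip(resp, docs))
                    / max(sum(r * n for r, (_, n) in zip(resp, docs)), 1e-12))
        th2 = _clip(sum((1 - r) * x for r, (x, _) in zip(resp, docs))
                    / max(sum((1 - r) * n for r, (_, n) in zip(resp, docs)), 1e-12))

    ll_mix = sum(_log_mix(x, n, lam, th1, th2)[0] for x, n in docs)
    return ll_mix - ll_single


# Toy data (hypothetical): eight posts of 100 tokens each.
bursty_counts = [10, 0, 0, 0, 10, 0, 0, 0]   # term concentrated in two posts
even_counts = [2, 3, 2, 3, 2, 3, 2, 3]       # term spread evenly across posts
lengths = [100] * 8

for name, counts in [("bursty", bursty_counts), ("even", even_counts)]:
    df = sum(1 for c in counts if c > 0)
    mix = mixture_score(counts, lengths)
    # Product of the two scores: an assumed combination, for illustration only.
    print(name, "idf:", idf(df, len(counts)), "mixture:", mix,
          "combined:", idf(df, len(counts)) * mix)
```

Under these assumptions, a term whose occurrences cluster in a few posts is explained much better by two occurrence rates than by one, so it receives a higher mixture score than a term with the same total count spread evenly.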
Similar resources
NSF-ITR/IM PROJECT: 2001 Abstracts From Bits to Information: Statistical Learning Technologies for Digital Information Management Search
Project Title: Polycategorical Categorization for Personalized Information Filtering PI: T. Hofmann Participants: Ioannis Tsochantaridis and Thomas Hofmann Abstract: Polycategorical categorization is an extension of standard classification in which items are labeled by multiple binary labels. We are particularly interested in cases with large numbers of overlapping categories and a priori unkno...
NSF-ITR/IM PROJECT: 2002 Abstracts From Bits to Information: Statistical Learning Technologies for Digital Information Management Search
Project Title: Support Vector Machines for Multiple Instance Learning PI: T. Hofmann Participants: Stuart Andrews and Thomas Hofmann Abstract: Multiple Instance Learning (MIL) is an important generalization of standard supervised binary classification. In MIL labels are not available for individual training patterns, but are associated with sets of patterns, which introduces additional uncertai...
NSF-ITR/IM PROJECT From Bits to Information: Statistical Learning Technologies for Digital Information Management Search
Project Title: Polycategorical Categorization for Personalized Information Filtering PI: T. Hofmann Participants: Ioannis Tsochantaridis and Thomas Hofmann Abstract: Polycategorical categorization is an extension of standard classification in which items are labeled by multiple binary labels. We are particularly interested in cases with large numbers of overlapping categories and a priori unkno...
The Use of Personal Digital Assistants and Smartphones in Accessing Health Information
Background and Aim: Today, one of the challenges doctors face is how to access medical information as quickly as possible. Personal Digital Assistants (PDAs) and smartphones are information technologies that can be used to access health information. This study aimed to review the most important uses of Personal Digital Assistants and smartphones in medicine and in accessing health inform...